Offsetting clock package pins in a clamshell topology to improve signal integrity

ABSTRACT

The disclosed embodiments relate to the design of a memory system which includes a set of one or more memory modules, wherein each memory module in the set has a clamshell configuration, wherein pairs of opposing memory packages containing memory chips are located on opposite sides of the memory module. The memory system also includes a multi-drop path containing signal lines which pass through the set of memory modules, and are coupled to memory packages in the set of memory modules. For a given signal line in the multi-drop path, a first memory package and a second memory package that comprise a given pair of opposing memory packages are coupled to the given signal line at a first location and a second location, respectively, wherein the first location and the second location are separated from each other by a distance d₁ along the given signal line.

BACKGROUND

1. Field

The disclosed embodiments relate to the design of memories for computer systems. More specifically, the disclosed embodiments relate to a technique for offsetting clock package pins in a clamshell memory topology to improve signal integrity.

2. Related Art

Modern memory systems often provide separate pathways for command/address signals and data signals. For example, some memory systems provide a multi-drop “fly-by path” to route command/address signals from a memory controller through multiple memory devices, and a separate “direct path” to communicate data signals directly between the memory controller and the memory devices. Some memory systems that provide a fly-by path have a “clamshell” configuration, wherein pairs of memory packages containing memory chips are located on opposite sides of each memory module.

In such clamshell fly-by topologies, the two memory packages that comprise a given pair of opposing memory packages typically tap into a signal line on the fly-by path (such as a clock line) at a single shared location, which is located toward the center of the memory packages. Moreover, the package pins used to carry such signals are typically also located toward the center of the memory packages to reduce associated stub lengths within the memory packages. However, tapping into a signal line at a single shared location effectively places a double load on the signal line at the shared location. This double load can worsen reflections that increase attenuation at high frequencies, and these reflections can be problematic for high-speed clock signals used for data timing.

Hence, what is needed is a method and an apparatus for reducing such reflections in signal lines in clamshell fly-by topologies.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 illustrates a computer system in accordance with the disclosed embodiments.

FIG. 2 illustrates a cross-sectional view of a memory module in accordance with the disclosed embodiments.

FIG. 3 illustrates a technique for spreading a load on a signal line in a clamshell fly-by topology in accordance with the disclosed embodiments.

FIG. 4 illustrates an exemplary pin layout for a memory package in accordance with the disclosed embodiments.

FIG. 5A illustrates another exemplary pin layout for a memory package in accordance with the disclosed embodiments.

FIG. 5B illustrates how corresponding signal lines from opposing chips connect to a multi-drop signal line in a circuit board in accordance with the disclosed embodiments.

FIG. 6 presents a flowchart illustrating the process of accessing a memory in accordance with the disclosed embodiments.

FIG. 7 presents a graph illustrating performance gained by offsetting clock pins in accordance with the disclosed embodiments.

FIG. 8A presents a graph illustrating frequency at a −10 dB design target as a function of clock pin offset distance in accordance with the disclosed embodiments.

FIG. 8B presents a graph illustrating gain in comparison to the zero offset case as a function of clock pin offset distance in accordance with the disclosed embodiments.

DETAILED DESCRIPTION

The disclosed embodiments generally relate to techniques for improving performance in high-speed memory systems. More specifically, the disclosed embodiments relate to a technique for reducing the attenuation of high-speed signals (such as clock signals) on a fly-by path in a clamshell fly-by memory topology. In particular, the attenuation of such high-speed signals can be reduced by separating pairs of clamshell loads so that the loads are more distributed along fly-by signal lines. This can be accomplished by moving the pins for each pair of clamshell loads in different directions on the fly-by path toward distal edges of their respective memory packages. When this is done, signal connections for pairs of clamshell memory chips are maximally separated on associated fly-by signal lines, while the separations between pins carrying other lower-speed signals on the fly-by path are minimized (which facilitates reducing associated memory package stub lengths). Note that the term “memory package” in the disclosure and appended claims can refer to any type of packaging (e.g., plastic or ceramic) that surrounds a semiconductor chip. The term “memory package” can also refer to a “flip chip” that includes solder bumps on the top surface of the chip, which enables the chip to bond to external circuitry when the chip is oriented face down.

Before describing this technique in more detail, we first describe structural features of a computer system that uses this technique.

Computer System

FIG. 1 illustrates a computer system 100 in accordance with the disclosed embodiments. Computer system 100 can generally include any type of computer system or computing device, including, but not limited to, a computer system based on a microprocessor, a mainframe computer, a digital signal processor, a portable computing device, a personal digital assistant, a cell phone, a device controller, or a computational engine within an appliance.

Computer system 100 includes a processor 106, which operates on code and data accessed from a memory system, which includes memory controller 104 and memory module 102. Processor 106 can generally include any type of single-core or multi-core processor. Moreover, memory controller 104 can generally include any type of circuitry for controlling accesses to memory chips located on one or more memory modules, such as memory module 102. Memory controller 104 can alternatively be incorporated into a functional unit within processor 106 itself. Note that although only a single memory module 102 is illustrated in FIG. 1, computer system 100 can generally include multiple memory modules.

As illustrated in FIG. 1, memory module 102 has a clamshell design, wherein pairs of opposing memory packages are located on opposite sides of a circuit board. Moreover, memory module 102 supports a fly-by topology, wherein command and address signals from memory controller 104 are communicated to the memory chips through a number of fly-by paths including fly-by path 108. At the same time, data signals are communicated directly between memory controller 104 and the memory chips through a number of direct paths between the memory controller and the memory chips, including direct path 110.

Memory Module

FIG. 2 illustrates a cross-sectional view of memory module 102 in accordance with the disclosed embodiments. Memory module 102 includes a circuit board 202, which for example can be located in the chassis of a computer system. As mentioned above, memory module 102 has a clamshell configuration in which pairs of memory packages are mounted on opposite sides of circuit board 202. For example, in FIG. 2 memory packages 204 and 206 are mounted on opposite sides of circuit board 202. Note that memory packages 204 and 206 are electrically coupled to traces (not shown) within circuit board 202 through a number of solder ball connections. Each memory package includes one or more memory chips. For example, in FIG. 2, memory package 204 includes memory chip 208. As is illustrated in FIG. 2, the memory chips are electrically coupled to their respective chip packages through center I/Os on the memory chips. Note that different types of memory devices may be used in memory module 102, for example, memory devices adhering to double data rate (DDR) standards, such as DDR2, DDR3, and DDR4, and future generations of memory devices, such as GDDR5, XDR, Mobile XDR, LPDDR, and LPDDR2. Also, other types of non-DDR clocked memory devices can be used.

Load Spreading

FIG. 3 illustrates a technique for spreading a load across a signal line in a clamshell fly-by topology in accordance with the disclosed embodiments. The top portion of FIG. 3 illustrates a conventional clamshell fly-by topology, wherein a signal line in a fly-by path passes through four pairs of opposing memory packages. Associated signals from the chips that comprise each pair tap into this signal line at the same shared location which is proximate to the centers of each of the memory chips. (These taps are illustrated as solid and dashed circles, with the solid circles illustrating taps from the front-side memory packages and the dashed circles illustrating taps from the back-side memory packages.) Note that the connections from each chip are located as close as possible to the centers of the memory chips to reduce stub lengths in the memory chip packages. Also note that the connections from the chips feed through vias within the memory module circuit board and ultimately connect to the signal line at a single shared location in the circuit board.

In contrast, the bottom portion of FIG. 3 illustrates how the associated connections from the pairs of chips can alternatively connect with the signal line at two separate locations. This effectively spreads out the load for these connections across the signal line, which in turn can improve performance for high-frequency signals, such as clock signals, on the fly-by path. In an ideal configuration, the distance d₁ between connections for corresponding signal lines from pairs of chips is substantially half of a spacing d₂ between successive pairs of memory packages along the given signal line. In this way, the load can be spread out along the signal line to improve signal integrity at high frequencies.
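
The geometry described above can be illustrated with a short sketch. It is not part of the disclosed embodiments: the function names, the choice of four pairs, the 10-unit spacing, and the assumption that each pair's two taps sit symmetrically about the pair's center are all hypothetical, chosen only to show why d₁ equal to half of d₂ yields evenly spaced loads.

```python
def tap_positions(num_pairs, d2, d1):
    """Return sorted tap locations along a signal line.

    Assumption: pair i is centered at i * d2, and its two packages tap
    the line at center - d1/2 and center + d1/2. Setting d1 = 0 gives
    the shared-tap (double-load) case.
    """
    taps = []
    for i in range(num_pairs):
        center = i * d2
        taps.append(center - d1 / 2)
        taps.append(center + d1 / 2)
    return sorted(taps)


def gaps(taps):
    """Distances between neighboring taps along the line."""
    return [b - a for a, b in zip(taps, taps[1:])]


# Hypothetical numbers: four pairs spaced d2 = 10 units apart.
shared = tap_positions(num_pairs=4, d2=10.0, d1=0.0)
spread = tap_positions(num_pairs=4, d2=10.0, d1=5.0)

print(gaps(shared))  # alternates between 0.0 and 10.0 (loads stacked in pairs)
print(gaps(spread))  # every gap is 5.0 (loads evenly distributed)
```

With d₁ = 0 the loads coincide in pairs; with d₁ = d₂/2 every load is the same distance from its neighbors, which is the evenly distributed arrangement the text calls ideal.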

Pin Layout

FIG. 4 illustrates an exemplary DDR3 pin layout for a memory package in accordance with the disclosed embodiments. Note that possible pin locations near the center of the memory package do not contain pins because the space is consumed by wiring that connects center I/Os from the memory chip with the memory package. (This wiring is illustrated in the cross-sectional view which appears in FIG. 2.)

The pins illustrated in FIG. 4 carry a number of different types of signals. (Note that FIG. 4 illustrates an exemplary DDR3 pin out for an x8 DRAM.) A first group of pins carry data signals, such as DQ0-DQ7, which as mentioned above are coupled to the memory controller through a direct path. A second group of pins carry various data-related signals (e.g., strobes), such as DQS, DQS#, DM/TDQS and NU/TDQS. A third group of pins carry various address/command signals, such as RAS#, CAS#, WE#, BA0-BA2, A0-A13, AP and BC#. A fourth group of pins carry clock signals, such as CK and CK#. A fifth group of pins carry various per-rank control signals, such as ODT, CS# and CKE. A sixth group of pins carry power-related signals, such as VSS, VDD, VDDQ and VSSQ. Finally, a seventh group of pins carry miscellaneous non-power signals, such as ZQ, VREFCA and RESET#.

Note that address and data signals can be paired between opposing chips as is illustrated by the curved dashed lines connecting DQ4 to DQ7 and A2 to A1 in FIG. 4. When these opposing chips are mounted on a circuit board, the pins from opposite sides of the line of symmetry are located over each other. This allows the pins to be easily coupled together through a via in the circuit board, wherein the via is electrically coupled to a corresponding signal line in either the fly-by path or the direct path. The paired address and data pins are “functionally equivalent” in the first and second memory packages. This means that the pins can be permuted differently for each memory package without affecting operation of the memory system. Note that address pins as well as data pins can be permuted differently for each memory chip without affecting how the memory system operates. For example, if address pins are permuted, a given address pattern will always reference the same memory location, so the memory will continue to operate in the same manner. Similarly, if data pins are permuted, data bits may be stored at different locations in a memory device, but when the data bits are read out through the permuted pins, they will be restored to their original order.
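
The reasoning about permuted pins can be checked with a small model. The following is a minimal sketch under stated assumptions: the wiring table, the helper names, and the 3-bit address width are hypothetical, and only the DQ4/DQ7 and A1/A2 pairings are taken from the description of FIG. 4.

```python
from itertools import product

# Hypothetical wiring: controller line i lands on device pin PERM[i].
# Only the DQ4 <-> DQ7 swap comes from the text; other lines pass straight through.
PERM = [0, 1, 2, 3, 7, 5, 6, 4]


def to_device(bits, perm):
    """Carry controller-side bits onto device pins through the wiring."""
    stored = [0] * len(bits)
    for line, pin in enumerate(perm):
        stored[pin] = bits[line]
    return stored


def to_controller(stored, perm):
    """Carry device-side bits back to controller lines through the same wiring."""
    return [stored[pin] for pin in perm]


data = [1, 0, 1, 1, 1, 0, 1, 0]
stored = to_device(data, PERM)
assert stored != data                        # bits sit at different device pins
assert to_controller(stored, PERM) == data   # original order restored on read

# Address case: a permutation of address bits is one-to-one, so distinct
# controller addresses still select distinct device locations.
ADDR_PERM = [0, 2, 1]  # hypothetical 3-bit address with A1 <-> A2 swapped
locations = {tuple(to_device(list(a), ADDR_PERM)) for a in product([0, 1], repeat=3)}
assert len(locations) == 8
```

Because the same wiring is traversed on both writes and reads, the permutation cancels for data, and because the address mapping is a bijection, each address consistently refers to one location.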

Moreover, in this example a number of clock and strobe signals, such as DQS, DQS#, CK and CK#, are located near the respective centers of associated pairs of opposing memory packages. This is done to reduce associated stub lengths within the memory packages and to thereby improve system performance. At the same time, other less time-critical signals (such as VSS, VDD, VSSQ and VDDQ), or per-rank signals (such as ODT and CS#) which are not paired with corresponding signals in the opposing chip, are pushed away from the central location and toward the edge of the chip.

In an alternative embodiment (not shown), opposing memory packages are configured to have a partially mirrored pin configuration, wherein some opposing “mirrored” pins in the first and second memory packages are configured to carry the same signals for each chip. This enables these mirrored pins to be coupled together through vias in the circuit board, wherein the vias can be electrically coupled to corresponding signal lines in either the fly-by path or the direct path. Note that the mirrored mode can be activated for a memory chip by writing to a control register 522 in the memory chip 552, or alternatively by setting appropriate voltages on special mirror-mode configuration pins on the memory chip.
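
One way to picture a mirrored mode is as a swap of signal assignments between pins that face each other across the package's axis of symmetry. The sketch below is illustrative only: the pin coordinates, the signal names SIG_A and SIG_B, the four-column grid, and the function names are hypothetical and do not describe any particular device or register interface.

```python
NCOLS = 4  # hypothetical number of pin columns across the package


def route_signals(layout, mirror_pairs, mirrored):
    """Return a pin-position -> signal map for one package.

    layout: assignment used in non-mirrored mode.
    mirror_pairs: pin positions facing each other across the axis of symmetry.
    mirrored: when True, the signals on each pair are swapped.
    """
    routing = dict(layout)
    if mirrored:
        for left, right in mirror_pairs:
            routing[left], routing[right] = routing[right], routing[left]
    return routing


def seen_through_board(routing):
    """Pin map of a back-side package as viewed from the front side.

    Mounting a package on the opposite side flips its columns.
    """
    return {(row, NCOLS - 1 - col): sig for (row, col), sig in routing.items()}


layout = {(0, 0): "SIG_A", (0, 3): "SIG_B"}
pairs = [((0, 0), (0, 3))]

front = route_signals(layout, pairs, mirrored=False)
back_plain = seen_through_board(route_signals(layout, pairs, mirrored=False))
back_mirrored = seen_through_board(route_signals(layout, pairs, mirrored=True))

assert back_plain != front      # opposing pins carry different signals
assert back_mirrored == front   # opposing pins carry the same signal
```

In this toy model, enabling the swap on the back-side package makes each of its pins line up with the front-side pin carrying the same signal, so the two can share a via.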

FIG. 5A illustrates another exemplary pin layout for a memory package in accordance with the disclosed embodiments. Similar to the layout illustrated in FIG. 4, in the layout illustrated in FIG. 5A, the pins carry a number of different types of signals. A first group of pins carry data signals, such as DQ0-DQ7. A second group of pins carry various data-related signals (e.g., strobes), such as EDC and DBI. A third group of pins carry various command/address signals, such as CA0-CA5. A fourth group of pins carry clock signals, such as DCLK, DCLK#, CK and CK#. A fifth group of pins carry various per-rank control signals, such as ODT, CS# and CKE. A sixth group of pins carry power-related signals (not shown). Finally, a seventh group of pins carry miscellaneous non-power signals, such as VREFCA, VREFDQ and RESET#.

Note that the command/address signals CA0-CA5 are located close to the center of the memory chip to minimize associated stub and trace lengths.

Also note that data pins DQ0-DQ1 and DQ4-DQ5 are mirrored across the line of symmetry. When the opposing packages are mounted on the circuit board, the mirrored pins from opposite sides of the line of symmetry are located over each other. This allows the pins to be easily coupled together through a via in the circuit board, wherein the via is electrically coupled to a corresponding signal line in either the fly-by path or the direct path. Note that coupling the signal lines in this way permutes the data pins. However, permuting the data pins between the mirrored chips in this way does not affect how the memory system operates.

In the embodiment illustrated in FIG. 5A, some pins carrying high-speed signals, such as DCLK and DCLK#, are pushed toward the edge of the chip package in order to space their loads apart on an associated signal line in the fly-by path. By doing this, the corresponding signals from opposing memory packages are coupled to the given signal line at different locations which are separated by a distance d₁. As mentioned above, the distance d₁ is ideally half of a spacing d₂ between successive pairs of opposing memory packages along the signal line. In this way, the coupling locations and associated loads for individual memory packages are distributed along the signal line.

More specifically, FIG. 5B illustrates how corresponding signal lines from opposing chips connect to a multi-drop signal line 568 within a circuit board 562 in accordance with the disclosed embodiments. In particular, a signal line 565 from memory chip 564 connects to multi-drop signal line 568 at a first location 569. At the same time, a corresponding signal line 567 from memory chip 566 connects to multi-drop signal line 568 at a second location 563. Note that the first location 569 and the second location 563 are separated by a distance d₁ along multi-drop signal line 568.

Process of Accessing A Memory

FIG. 6 presents a flowchart illustrating the process of accessing a memory system in accordance with the disclosed embodiments. First, during a memory reference, the memory controller sends address signals, clock signals and control signals to the memory modules through a fly-by path between the memory controller and the set of memory modules. During this process, a signal line in the fly-by path carries a high-speed signal (such as a data clock signal) through opposing pairs of memory packages, wherein the memory packages in each pair of memory packages are coupled to the signal line at different locations. Moreover, the different locations are separated from each other by a distance d₁ along the signal line (step 602). In one embodiment, the distance d₁ is substantially half of a spacing d₂ between successive pairs of opposing memory packages along the given signal line, whereby the resulting coupling locations and associated loads for memory packages are distributed along the given signal line.

While the signals are being communicated through the fly-by path, the memory controller communicates data signals to the memory modules through a direct path, which couples data lines from the memory controller directly to memory packages in the memory modules (step 604).

Performance

FIG. 7 presents a graph illustrating simulation results which indicate the performance gained by using offset clock pins in accordance with the disclosed embodiments. The graph in FIG. 7 presents a transfer function (in dB) for a fly-by signal line with centered clock pins 702, and also for a fly-by signal line with offset clock pins 704. Moreover, the dashed horizontal line in the graph illustrates a “design target” of −10 dB. Note that using offset clock pins increases the transfer function by about 2 dB. This translates into an increase in clock speed from 1.57 GHz to 1.86 GHz at the design target of −10 dB. Also note that the performance gained by offsetting the clock pins and spreading out the load exceeds the performance lost due to the increased via loading (two vias versus one via for each pair of clamshell packages).

Moreover, the above-described clock signal can be one of a number of “quadrature” clock signals that operate at a fraction of the aggregate frequency of the memory system. For example, if the memory system operates at a data rate of 8.0 Gbps, a corresponding quadrature clock signal can operate at a slower 2.0 GHz frequency (instead of a 4.0 GHz frequency).
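
The arithmetic behind this example can be written out as follows. The sketch assumes, as an interpretation of the figures quoted above rather than a statement from the disclosure, that the 4.0 GHz case corresponds to two bits transferred per clock cycle and the quadrature case to four bits per clock cycle; the function name is hypothetical.

```python
def clock_frequency_ghz(data_rate_gbps, bits_per_clock_cycle):
    """Clock frequency needed to sustain a per-pin data rate.

    Assumption: each clock cycle times a fixed number of bit transfers.
    """
    return data_rate_gbps / bits_per_clock_cycle


# Assumed: two transfers per cycle for the 4.0 GHz clock.
print(clock_frequency_ghz(8.0, 2))  # 4.0
# Assumed: four transfers per cycle for the quadrature clock.
print(clock_frequency_ghz(8.0, 4))  # 2.0
```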

Furthermore, as memory speeds continue to increase, it will become advantageous to similarly spread out the load for other slower, non-clock-related signals, such as other command and address signals in the fly-by path.

FIG. 8A presents a graph illustrating performance (expressed as frequency at the −10 dB design target) as a function of clock pin offset distance in accordance with the disclosed embodiments. Note that the clock pin offset distance is expressed in units of pitch, which involves dividing d₁ by d₂. Hence, an offset distance of 1.0 indicates that d₁ = d₂, and an offset distance of 0 indicates that signal lines from a pair of opposing memory chips intersect the fly-by signal line at the same place. Moreover, when the offset distance equals 0.5 the intersections (and hence the associated loads) from the pair of opposing memory chips and adjacent pairs of chips are maximally distributed. Note that as the offset distance grows larger than 0.5, a clock line intersection from a first pair of memory modules becomes closer to a clock line intersection from a successive second pair of memory modules, until the clock line intersections from the first and second pairs of memory modules eventually touch when the offset distance equals 1.0. Note that as the offset distance increases from 0 to 0.5, the frequency at the design target of −10 dB increases by about 1.9 dB.
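
The behavior of the offset in units of pitch can be expressed compactly. In the sketch below, which is a hypothetical illustration rather than part of the simulation behind FIG. 8A, the smallest distance from a tap to its nearest neighbor is the lesser of the distance to its partner in the same pair and the distance to the closest tap of the adjacent pair.

```python
def nearest_tap_gap(offset):
    """Smallest gap between neighboring taps, in units of pitch.

    offset is d1 / d2. The partner tap in the same pair is `offset` away;
    the closest tap of the adjacent pair is `1 - offset` away.
    """
    if not 0.0 <= offset <= 1.0:
        raise ValueError("offset must lie between 0 and 1 pitch")
    return min(offset, 1.0 - offset)


offsets = [i / 8 for i in range(9)]          # 0, 0.125, ..., 1.0
best = max(offsets, key=nearest_tap_gap)

print(best)                    # 0.5
print(nearest_tap_gap(0.0))    # 0.0 (taps within a pair coincide)
print(nearest_tap_gap(1.0))    # 0.0 (taps of adjacent pairs coincide)
```

The gap is largest at an offset of 0.5 and falls back to zero at both 0 and 1.0, matching the description of intersections that coincide at either extreme.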

FIG. 8B presents a graph illustrating gain in comparison to the zero offset case as a function of clock pin offset distance in accordance with the disclosed embodiments. As indicated in this graph, an offset of 0.125 in pitch produces a gain of 0.84 dB, an offset of 0.25 in pitch produces a gain of 2.18 dB, an offset of 0.375 in pitch produces a gain of 3.07 dB, and finally, an offset of 0.50 in pitch produces a gain of 3.68 dB.
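
For reference, the values quoted above can be tabulated to show how much additional gain each 0.125-pitch increment contributes. The zero-offset entry of 0 dB follows from the gain being measured relative to the zero offset case; the variable names are hypothetical, and the sketch only restates the quoted figures.

```python
# Gain (dB) relative to the zero-offset case, keyed by offset in pitch.
gain_db = {0.0: 0.0, 0.125: 0.84, 0.25: 2.18, 0.375: 3.07, 0.50: 3.68}

offsets = sorted(gain_db)
# Additional gain contributed by each successive 0.125-pitch step.
steps = [round(gain_db[b] - gain_db[a], 2) for a, b in zip(offsets, offsets[1:])]

print(steps)  # [0.84, 1.34, 0.89, 0.61]
```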

The preceding description was presented to enable any person skilled in the art to make and use the disclosed embodiments, and is provided in the context of a particular application and its requirements. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the disclosed embodiments. Thus, the disclosed embodiments are not limited to the embodiments shown, but are to be accorded the widest scope consistent with the principles and features disclosed herein. Accordingly, many modifications and variations will be apparent to practitioners skilled in the art. Additionally, the above disclosure is not intended to limit the present description. The scope of the present description is defined by the appended claims.

Also, some of the above-described methods and processes can be embodied as code and/or data, which can be stored in a computer-readable storage medium as described above. When a computer system reads and executes the code and/or data stored on the computer-readable storage medium, the computer system performs the methods and processes embodied as data structures and code and stored within the computer-readable storage medium. Furthermore, the methods and processes described above can be included in hardware modules.

For example, the hardware modules can include, but are not limited to, application-specific integrated circuit (ASIC) chips, field-programmable gate arrays (FPGAs), and other programmable-logic devices now known or later developed. When the hardware modules are activated, the hardware modules perform the methods and processes included within the hardware modules.

What is claimed is:
 1. A memory system, comprising: a circuit board having one or more pairs of opposing memory chips which are located on opposite sides of the circuit board in a clamshell configuration; and a multi-drop path for at least one signal line coupled to each memory chip; wherein a first memory chip and a second memory chip are coupled to the at least one signal line at a first location and a second location on the signal line, respectively, wherein the first location and the second location are separated from each other by a distance d₁ along the signal line.
 2. The memory system of claim 1, wherein the distance d₁ is substantially half of a spacing d₂ between successive pairs of opposing memory chips along the given signal line, whereby coupling locations and associated loads for individual memory chips are distributed along the signal line.
 3. (canceled)
 4. The memory system of claim 1, wherein the first and second memory chips are contained within a first and a second memory package, respectively; and wherein for another signal line in the multi-drop path, the first and second memory chips in each pair of opposing memory chips are coupled to the other signal line at the same shared location.
 5-10. (canceled)
 11. A memory module, comprising: a circuit board having one or more pairs of opposing memory chips which are located on opposite sides of the circuit board in a clamshell configuration; wherein the circuit board is configured to support a multi-drop path for at least one signal line coupled to each memory chip; and wherein a first memory chip and a second memory chip are coupled to the at least one signal line at a first location and a second location on the signal line, respectively, wherein the first location and the second location are separated from each other by a distance d₁ along the signal line.
 12. The memory module of claim 11, wherein the distance d₁ is substantially half of a spacing d₂ between successive pairs of opposing memory chips along the given signal line, whereby coupling locations and associated loads for individual memory chips are distributed along the signal line.
 13. The memory module of claim 11, wherein the spacing between the first and second locations is facilitated by locating associated pins on a first and a second memory package for the first and second memory chips in opposite directions from respective centers of the first and second memory packages.
 14. The memory module of claim 11, wherein the first and second memory chips are contained within a first and a second memory package, respectively; and wherein for another signal line in the multi-drop path, the first and second memory packages in each pair of opposing memory packages are coupled to the other signal line at the same shared location.
 15. The memory module of claim 14, wherein the shared location is proximate to respective centers of the first and second memory packages when the first and second memory packages are mounted on opposite sides of the circuit board; and wherein associated pins on the first and second memory packages are located in proximity to respective centers of the first and second memory packages to reduce associated stub lengths and trace lengths within the first and second memory packages and within the circuit board.
 16. The memory module of claim 14, wherein the first and second memory packages are configured to have a partially mirrored pin configuration wherein some opposing pins in the first and second memory packages are associated with the same signal; and wherein the other signal line in the multi-drop path is coupled to opposing pins associated with the same signal in the first and second memory packages.
 17. The memory module of claim 14, wherein the other signal line in the multi-drop path is coupled to functionally equivalent pins in the first and second memory packages, wherein the functionally equivalent pins can be permuted differently for each memory package without affecting operation of the memory system; and wherein the other signal line in the multi-drop path is coupled to functionally equivalent opposing pins in the first and second memory packages.
 18. The memory module of claim 11, wherein the multi-drop path is a fly-by path; and wherein the memory system further comprises a direct path which couples data lines directly to memory chips on the circuit board.
 19. The memory module of claim 11, wherein the signal lines in the multi-drop path contain address lines, clock lines and control lines.
 20. The memory module of claim 19, wherein the given signal line is a clock line.
 21-25. (canceled)
 26. A method for accessing a memory, comprising: directing memory accesses that arise during execution of a program by a processor to at least one circuit board having one or more pairs of opposing memory chips which are located on opposite sides of the circuit board in a clamshell configuration; wherein directing the memory access involves sending address signals, clock signals and control signals to the circuit board through a multi-drop path containing at least one signal line coupled to each memory chip; and wherein a first memory chip and a second memory chip are coupled to the signal line at a first location and a second location on the signal line, respectively, wherein the first location and the second location are separated from each other by a distance d₁ along the signal line.
 27. The method of claim 26, wherein directing the memory access also involves communicating data signals through a direct path which couples data lines from the memory controller directly to memory chips on the circuit board.
 28. The method of claim 26, wherein the distance d₁ is substantially half of a spacing d₂ between successive pairs of opposing memory chips along the given signal line, whereby coupling locations and associated loads for individual memory chips are distributed along the given signal line.
 29-30. (canceled)
 31. A chip package, comprising: a semiconductor chip; routing circuitry to connect internal signals from the semiconductor chip to I/O pins on the chip package; and wherein in a non-mirrored mode, the routing circuitry does not change the routing of internal signals to I/O pins; and wherein in a mirrored mode, the routing circuitry swaps the routing of internal signals for at least one pair of I/O pins across an axis of symmetry for the chip package, so that when the chip package is mounted on an opposite side of a circuit board from an opposing chip package, at least one pair of opposing I/O pins in the chip package and the opposing chip package are associated with the same signal.
 32. The semiconductor chip of claim 31, wherein the routing circuitry includes a configuration register to select between the mirrored mode and the non-mirrored mode.
 33. A chip package, comprising: I/O pins coupled to a semiconductor chip located within the chip package, wherein the I/O pins include a first set of I/O pins and a second set of I/O pins; wherein the first set of I/O pins are located in proximity to an axis of symmetry of the chip package to reduce associated stub lengths and trace lengths within the chip package and within a circuit board when the chip package is mounted in a clamshell configuration, wherein an opposing chip package is mounted on an opposite side of the circuit board from the chip package; and wherein the second set of I/O pins are located away from the axis of symmetry, so that when the chip package is mounted in the clamshell configuration, an I/O pin in the second set of I/O pins, and a corresponding I/O pin on the opposing chip package which carries the same signal, are coupled to a signal line in the circuit board at a first location and a second location on the signal line, respectively, wherein the first location and the second location are separated from each other by a distance d₁ along the signal line.
 34. The chip package of claim 33, wherein the distance d₁ is substantially half of a spacing d₂ between successive pairs of opposing chip packages along the signal line, whereby coupling locations and associated loads for individual chip packages are distributed along the signal line.